Abstract: An algorithm to count, or alternatively generate, all $k$-element
transversals of a set system is presented. For special cases it works in
output-linear time.

1 Introduction

Generating all minimal transversals of a hypergraph $\mathcal{H}$ based on a set $W$ is a prominent
research endeavour [EMG]. But generating (and evaluating) all transversals of $\mathcal{H}$
may also be required [W3]. Likewise, the focus may be on all $k$-element transversals for some
integer $k$. For instance, in [W2] they need to be counted (not generated) for $k = 1$ up to
$k = |W|$. As to fixed cardinality constraints in general, see also [BEHM]. While [W2] and
[W3] display particular applications of the so-called transversal e-algorithm, the present
paper harks back to [W1] and provides additional theoretical results.



Let us begin with a broader perspective and then zoom in on transversals. Suppose
that $a_1$ up to $a_h$ denote "constraints" applying to subsets $X$ of a finite set $W$. Many
kinds of combinatorial objects can be modelled as the sets $X$ that satisfy $h$ suitably
chosen constraints. The principle of inclusion-exclusion states that

$$N(a_1 \wedge \cdots \wedge a_h) = 2^{|W|} - \sum_{i=1}^{h} N(\overline{a_i}) + \sum_{1 \le i < j \le h} N(\overline{a_i} \wedge \overline{a_j}) - \cdots + (-1)^h N(\overline{a_1} \wedge \cdots \wedge \overline{a_h}),$$

where $N(a_1 \wedge \cdots \wedge a_h)$ is the number of $X \subseteq W$ satisfying all constraints, and e.g. $N(\overline{a_i} \wedge \overline{a_j})$
is the number of $X \subseteq W$ satisfying neither $a_i$ nor $a_j$. Unfortunately $2^h$ terms need to be
added or subtracted, and often it is cumbersome to compute the terms themselves.
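The cost of the $2^h$ terms can be seen in a few lines of Python. The sketch below is only an illustration under our own assumptions (the toy hypergraph and both function names are hypothetical, not taken from [W1]). It uses the constraints $X \cap H_i \neq \emptyset$ studied in this article, for which a set violates all constraints indexed by $S$ exactly when it avoids the union of the $H_i$ with $i \in S$, so each term is a power of two.

```python
from itertools import combinations

def count_by_inclusion_exclusion(w, edges):
    """Count the X in {1..w} hitting every edge; loops over all 2^h index sets."""
    total = 0
    for size in range(len(edges) + 1):
        for chosen in combinations(edges, size):
            avoided = set().union(*chosen)
            # the sets missing all chosen edges are the subsets of W minus their union
            total += (-1) ** size * 2 ** (w - len(avoided))
    return total

def count_by_brute_force(w, edges):
    """Reference count by testing all 2^w subsets."""
    count = 0
    for mask in range(2 ** w):
        x = {i + 1 for i in range(w) if mask >> i & 1}
        if all(x & edge for edge in edges):
            count += 1
    return count

toy_edges = [{1, 2}, {2, 3}, {3, 4}]   # hypothetical toy hypergraph on W = [4]
print(count_by_inclusion_exclusion(4, toy_edges), count_by_brute_force(4, toy_edges))
```

Both counts agree on the toy instance; the first routine needs $2^h$ terms, the second $2^{|W|}$ membership tests.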

Enter the principle of exclusion (POE), which is discussed in detail in [W1]. Its basic
policy is simply to start with $\mathrm{Mod}_0 := 2^W$ and to exclude iteratively all sets $X \subseteq W$ that fail
to have property $a_1, a_2, \ldots, a_h$. Thus, writing $X \models a_i$ when $X$ satisfies $a_i$, one has:

$$\mathrm{Mod}_0 \supseteq \mathrm{Mod}_1 := \{X \in \mathrm{Mod}_0 : X \models a_1\} \supseteq \mathrm{Mod}_2 := \{X \in \mathrm{Mod}_1 : X \models a_2\}$$

and so forth. Obviously $\mathrm{Mod}_h := \{X \in \mathrm{Mod}_{h-1} : X \models a_h\}$ comprises exactly the
$X$'s that satisfy all constraints, and so $N(a_1 \wedge \cdots \wedge a_h) = |\mathrm{Mod}_h|$. This seems like
a naive approach, but a compact way to pack the members of $\mathrm{Mod}_i$ (within so-called
multivalued rows) often makes it work. In the present article the combinatorial objects
at stake are the transversals (or hitting sets) $X$ of a given set system (= hypergraph)
$\mathcal{H} = \{H_1, H_2, \ldots, H_h\}$ of subsets of $W$. Indeed, defining $X \models a_i$ by $X \cap H_i \neq \emptyset$ unleashes
the POE framework.

Here is the section breakdown. A medium-size example is given in Section 2, and
endowed with theory in Section 4. (Section 3 is discussed in a moment.) Specifically,
transversals can be viewed as models of a (dual) Horn formula, and hence some facts of
[W1] will carry over, but simplify and fortify in the process. This is done in Theorem
4 which exclusively targets fixed cardinality transversals, be it counting or generating.
Under quite natural side conditions that can be done in output-linear time.

Whereas [W1] concentrates on how the mentioned multivalued rows reproduce, in Section 3
of the present article we focus on individual multivalued rows $r$ and how the $k$-element
sets contained in $r$ can be counted or generated efficiently. We also give the asymptotic
number of length $n$ multivalued rows as $n$ goes to $\infty$. Parts of Section 4 depend on Section
3. Section 5 briefly points out the pros and cons of POE as compared to binary decision
diagrams.

For positive integers $w$ we put $[w] := \{1, 2, \ldots, w\}$.



2 The transversal e-algorithm by example

Consider the $(14, 6)$-hypergraph with vertex set $W = [14]$ and set $\mathcal{H} = \{H_1, \ldots, H_6\}$ of
hyperedges defined by

$$H_1 = \{3, 4, 9\}, \quad H_2 = \{5, 10\}, \quad H_3 = \{6, 7, 11, 12\}, \quad H_4 = \{8, 13, 14\},$$

$$H_5 = \{1, 2, 3, 4, 5, 6, 7, 8\}, \quad H_6 = \{3, 4, 5, 8, 12, 13\}.$$

As alluded to in the introduction, starting with the powerset $\mathrm{Mod}_0 := 2^W$ we filter out
the family $\mathrm{Mod}_1 \subseteq \mathrm{Mod}_0$ of all $X \in \mathrm{Mod}_0$ with $X \cap H_1 \neq \emptyset$. Then we filter out the family
$\mathrm{Mod}_2 \subseteq \mathrm{Mod}_1$ of all $X \in \mathrm{Mod}_1$ with $X \cap H_2 \neq \emptyset$, and so forth. After having processed
$H_h$ ($h = 6$), the family $\mathrm{Mod}_6$ obviously consists of all transversals of $\mathcal{H}$.
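Before any compression by multivalued rows, the filtering chain can be checked naively. The following Python sketch (the helper name `exclusion_chain` is our own, hypothetical choice) enumerates all $2^{14}$ subsets explicitly, which is feasible only because $w = 14$ is tiny; it is meant as a sanity check of the example, not as the e-algorithm itself.

```python
EDGES = [
    {3, 4, 9}, {5, 10}, {6, 7, 11, 12}, {8, 13, 14},
    {1, 2, 3, 4, 5, 6, 7, 8}, {3, 4, 5, 8, 12, 13},
]

def exclusion_chain(w, edges):
    """Return [|Mod_0|, |Mod_1|, ..., |Mod_h|] by explicit filtering."""
    mod = [frozenset(i + 1 for i in range(w) if mask >> i & 1)
           for mask in range(2 ** w)]
    sizes = [len(mod)]
    for edge in edges:
        # keep only the sets X with X ∩ H_i nonempty
        mod = [x for x in mod if x & edge]
        sizes.append(len(mod))
    return sizes

sizes = exclusion_chain(14, EDGES)
print(sizes[-1])   # number of transversals of the example hypergraph
```

The last entry of `sizes` is the number of transversals, which Section 2 computes as 8784 by a far more economical route.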

Under the transversal e-algorithm (or briefly e-algorithm) each set $X$ in $\mathrm{Mod}_0$ will be
identified with its characteristic 0,1-vector of length 14. But whenever possible we use
the label 2 to indicate that an entry is allowed to be either 0 or 1. Thus the powerset is
written as $\mathrm{Mod}_0 = (2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)$. Actually, it is more precise to write
$\mathrm{Mod}_0 = \{(2, \ldots, 2)\}$. Similarly set $\mathrm{Mod}_1 = \{(2, 2, e, e, 2, 2, 2, 2, e, 2, 2, 2, 2, 2)\}$ because
$H_1 = \{3, 4, 9\}$ and a string of symbols $e$ by definition means that only characteristic
vectors $X$ are allowed which have at least one 1 in a position occupied by an $e$. Similarly
we obtain $\mathrm{Mod}_2$, $\mathrm{Mod}_3$, $\mathrm{Mod}_4$, but of course we need to introduce subscripts to distinguish
the different e-constraints. Thus $\mathrm{Mod}_4 = \{r\}$ where

$$r := (2, 2, e_1, e_1, e_2, e_3, e_3, e_4, e_1, e_2, e_3, e_3, e_4, e_4).$$

So far so good, but it is going to be harder to get $\mathrm{Mod}_5$ because $H_5$ intersects four
e-bubbles. As a starter, in view of $H_5 \cap H_1 = \{3, 4\}$ let us split $r$ into the disjoint union
of

$$r[e] := \{X \in r : X \cap \{3, 4\} \neq \emptyset\} \quad \text{and} \quad r[0] := \{X \in r : X \cap \{3, 4\} = \emptyset\}.$$

Using our new notation,

$$r[e] = (2, 2, e, e, e_2, e_3, e_3, e_4, 2, e_2, e_3, e_3, e_4, e_4),$$

$$r[0] = (2, 2, 0, 0, e_2, e_3, e_3, e_4, 1, e_2, e_3, e_3, e_4, e_4).$$

Thus $e_1 e_1 e_1$ is split into $e\,e\,2$ and $0\,0\,1$. All $X \in r[e]$ satisfy the fifth constraint since
$X \cap H_5 \supseteq X \cap \{3, 4\} \neq \emptyset$. But some $X \in r[0]$ do not satisfy it. In order to exclude these $X$'s and
in view of $H_5 \cap H_2 = \{5\}$, we split $r[0]$ into

$$r[0, e] := (2, 2, 0, 0, 1, e_3, e_3, e_4, 1, 2, e_3, e_3, e_4, e_4),$$

$$r[0, 0] := (2, 2, 0, 0, 0, e_3, e_3, e_4, 1, 1, e_3, e_3, e_4, e_4).$$

Now all $X \in r[0, e]$ satisfy $X \cap H_5 \neq \emptyset$, but not all $X \in r[0, 0]$ satisfy this. Thus, similarly,
we split $r[0, 0]$ into $r[0, 0, e]$ and $r[0, 0, 0]$. Then $r[0, 0, 0]$ is split into $r[0, 0, 0, e]$ and

$$r[0, 0, 0, 0]' = (2, 2, 0, 0, 0, 0, 0, 0, 1, 1, e_3, e_3, e_4, e_4).$$

This row need not be split; the only sets $X \in r[0, 0, 0, 0]'$ satisfying $X \cap H_5 \neq \emptyset$ are the
ones with $X \cap \{1, 2\} \neq \emptyset$. They are precisely the elements of

$$r[0, 0, 0, 0] := (e, e, 0, 0, 0, 0, 0, 0, 1, 1, e_3, e_3, e_4, e_4).$$

Thus

$$\mathrm{Mod}_5 = \{r[e],\ r[0, e],\ r[0, 0, e],\ r[0, 0, 0, e],\ r[0, 0, 0, 0]\}.$$

Table 1: Compact representation of a transversal hypergraph



Let us process the rows of $\mathrm{Mod}_5$ and sieve out in each row the $X$'s that satisfy $X \cap H_6 \neq \emptyset$.
All $X \in r[e]$ satisfy this constraint (because of $ee$ at positions 3, 4), so we carry over $r[e]$
unaltered but relabel it $r_1$. Ditto $r[0, e]$ satisfies the sixth constraint (because of the 1 at
position 5) and carries over alias $r_2$. Let $\rho := r[0, 0, e]$ and replace $ee$ by $e_1 e_1$ for cosmetic
reasons. Using obvious notation we have $H_6 \cap \mathrm{supp}(e_4) = \{8, 13\}$, and so we need to split
$\rho$ into

$$\rho[e] = (2, 2, 0, 0, 0, e_1, e_1, e, 1, 1, 2, 2, e, 2),$$

$$\rho[0] = (2, 2, 0, 0, 0, e_1, e_1, 0, 1, 1, 2, 2, 0, 1).$$

Row $\rho[e]$ carries over alias $r_3$. With obvious notation, $\mathrm{twos}(\rho) \cap H_6 = \{1, 2, 11, 12\} \cap H_6 \neq \emptyset$,
and so row $\rho[0]$ can change and survive as

$$\rho[0; e] = (2, 2, 0, 0, 0, e_1, e_1, 0, 1, 1, 2, 1, 0, 1) \quad (= r_4).$$

As to $r[0, 0, 0, e]$, all its members $X$ satisfy $X \cap H_6 \neq \emptyset$ and so $r[0, 0, 0, e]$ carries over alias
$r_5$. But $\sigma := r[0, 0, 0, 0]$ has $H_6 \cap \mathrm{supp}(e_3) = \{12\}$ and needs to be split into

$$\sigma[e] = (e_1, e_1, 0, 0, 0, 0, 0, 0, 1, 1, 2, 1, e_4, e_4),$$

$$\sigma[0] = (e_1, e_1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, e_4, e_4).$$

Row $\sigma[e]$ carries over alias $r_6$, but $\sigma[0]$ in view of $H_6 \cap \mathrm{supp}(e_4) = \{13\}$ is further split
into

$$\sigma[0, e] = (e_1, e_1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 2),$$

$$\sigma[0, 0] = (e_1, e_1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1).$$

Row $\sigma[0, e]$ carries over alias $r_7$, but $\sigma[0, 0]$ is cancelled since $H_6 \cap X = \emptyset$ for all $X \in \sigma[0, 0]$.
To summarize, $\mathrm{Mod}_6 := \{r_1, \ldots, r_7\}$ encodes all transversals of the set system $\mathcal{H}$.

Due to the disjointness of rows the number $N$ of transversals of $\mathcal{H}$, i.e. the sum of the
cardinalities of the $R = 7$ final rows constituting $\mathrm{Mod}_6$, is

$$N = 2^3 (2^2 - 1)(2^2 - 1)(2^4 - 1)(2^3 - 1) + 840 + 288 + 24 + 48 + 18 + 6 = 8784.$$

This is fairly evident, and further formalized in Section 4.
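The seven summands can be recomputed mechanically: every 2 contributes a factor 2 and every e-bubble of length $\varepsilon$ a factor $2^{\varepsilon} - 1$. In the Python sketch below the rows $r_1, \ldots, r_7$ are transcribed from the derivation above; the string encoding and the name `row_size` are our own illustrative assumptions.

```python
from collections import Counter

# r_1 .. r_7 as derived in Section 2; equal e-labels within a row form one e-bubble
ROWS = [
    "2 2 e e e2 e3 e3 e4 2 e2 e3 e3 e4 e4",    # r_1 = r[e]
    "2 2 0 0 1 e3 e3 e4 1 2 e3 e3 e4 e4",      # r_2 = r[0,e]
    "2 2 0 0 0 e1 e1 e 1 1 2 2 e 2",           # r_3 = rho[e]
    "2 2 0 0 0 e1 e1 0 1 1 2 1 0 1",           # r_4 = rho[0;e]
    "2 2 0 0 0 0 0 1 1 1 e3 e3 2 2",           # r_5 = r[0,0,0,e]
    "e1 e1 0 0 0 0 0 0 1 1 2 1 e4 e4",         # r_6 = sigma[e]
    "e1 e1 0 0 0 0 0 0 1 1 1 0 1 2",           # r_7 = sigma[0,e]
]

def row_size(row):
    """Number of sets represented by a {0,1,2,e}-valued row."""
    symbols = row.split()
    size = 2 ** symbols.count("2")
    bubbles = Counter(s for s in symbols if s.startswith("e"))
    for eps in bubbles.values():
        size *= 2 ** eps - 1      # all subsets of the bubble except the empty one
    return size

sizes = [row_size(r) for r in ROWS]
print(sizes, sum(sizes))
```

The individual sizes reproduce the summands of $N$, and their sum is 8784.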

2.1 Another benefit of the e-formalism

This ad hoc subsection fits in well but is not related to the remainder of the paper.
Put $W = [w]$. Rather than $\mathrm{Mod}_h$ we shall henceforth write $\mathrm{Tr}(\mathcal{H})$ for the transversal
hypergraph, i.e. for the family of all transversals of a hypergraph $\mathcal{H} \subseteq 2^W$. Fixing $A \subseteq W$
we aim to find all $X \in \mathrm{Tr}(\mathcal{H})$ with $X \subseteq A$. Dually we may wish to sieve all $X \in \mathrm{Tr}(\mathcal{H})$
with $A \subseteq X$. Set $\mathcal{H}' := \{H_i \cap A : H_i \in \mathcal{H}\}$ and $\mathcal{H}'' := \{H_i \in \mathcal{H} : H_i \cap A = \emptyset\}$. Then

$$\{X \in \mathrm{Tr}(\mathcal{H}) : X \subseteq A\} = \mathrm{Tr}(\mathcal{H}')$$

$$\{X \in \mathrm{Tr}(\mathcal{H}) : A \subseteq X\} = \{A \cup Y : Y \in \mathrm{Tr}(\mathcal{H}'')\}$$

Suppose for 1000 sets $A_j$ one has to solve one of these tasks (or variations thereof). Rather
than running the e-algorithm 1000 times for varying $\mathcal{H}', \mathcal{H}''$, it is better to run it once for
$\mathcal{H}$. The 1000 required set families are then easily obtained from $\mathrm{Tr}(\mathcal{H})$. For instance, if
$\mathcal{H} = \{H_1, \ldots, H_6\}$ is as above, then

$$\{X \in \mathrm{Tr}(\mathcal{H}) : 7 \notin X \text{ and } \{8, 9\} \subseteq X\}$$

is the disjoint union of these four rows derived from $r_1, r_2, r_3, r_5$ in Table 1:



   1    2    3    4    5    6    7    8    9   10   11   12   13   14
   2    2    e    e  e_2  e_3    0    1    1  e_2  e_3  e_3    2    2
   2    2    0    0    1  e_3    0    1    1    2  e_3  e_3    2    2
   2    2    0    0    0    1    0    1    1    1    2    2    2    2
   2    2    0    0    0    0    0    1    1    1  e_3  e_3    2    2
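Such a restriction of a row is purely local and easy to mechanize. The Python sketch below is a minimal illustration under our own assumptions (the list-of-strings encoding and the function name `force` are hypothetical): forcing a position of an e-bubble to 0 shrinks the bubble (a single survivor becomes a 1), while forcing it to 1 satisfies the bubble, whose remaining positions become 2.

```python
def force(row, zeros=(), ones=()):
    """Restrict a row to the sets X with X ∩ zeros = ∅ and ones ⊆ X.

    `row` is a list of symbols "0", "1", "2" or e-labels such as "e3";
    positions are 1-based. Returns the new row, or None if no set survives.
    """
    row = list(row)
    for p in zeros:
        s = row[p - 1]
        if s == "1":
            return None                      # position is forced to 1 already
        row[p - 1] = "0"
        if s.startswith("e"):
            rest = [q for q, t in enumerate(row) if t == s]
            if not rest:
                return None                  # the e-bubble cannot be hit anymore
            if len(rest) == 1:
                row[rest[0]] = "1"           # a bubble of length one is a 1
    for p in ones:
        s = row[p - 1]
        if s == "0":
            return None
        row[p - 1] = "1"
        if s.startswith("e"):
            for q, t in enumerate(row):
                if t == s:
                    row[q] = "2"             # bubble is satisfied, rest is free
    return row

r3 = "2 2 0 0 0 e1 e1 e 1 1 2 2 e 2".split()        # row r_3 of Section 2
r4 = "2 2 0 0 0 e1 e1 0 1 1 2 1 0 1".split()        # row r_4 of Section 2
print(force(r3, zeros=[7], ones=[8, 9]))
print(force(r4, zeros=[7], ones=[8, 9]))
```

Applied to $r_3$ the sketch yields the third row of the table above, whereas $r_4$ (which has a 0 at position 8) contributes nothing.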



3 Individual {0, 1, 2, e}-valued rows

Here we look at {0, 1, 2, e}-valued rows on their own. Thus the row splitting process we
glimpsed in Section 2, and the resulting interdependence of rows, will be postponed to
Section 4. Subsection 3.1 gives the formal definition of a {0, 1, 2, e}-valued row $r$, along
with the number $f(w)$ of such rows of length $w$. In 3.2 and 3.3 we show how the $k$-element
sets within a fixed row can be counted, respectively generated. The special case $k = k_{\min}$
deserves extra attention (3.4).



3.1 Formal definition and number of {0, 1, 2, e}-valued rows

Formally, a {0, 1, 2, e}-valued row on a finite set $W$ is a quadruplet

$$r := (\mathrm{zeros}(r), \mathrm{ones}(r), \mathrm{twos}(r), \mathrm{ebubbles}(r))$$

such that $W$ is a disjoint union of the sets $\mathrm{zeros}(r), \ldots, \mathrm{ebubbles}(r)$, where any of these
may be empty. Furthermore, if $\mathrm{ebubbles}(r) \neq \emptyset$ then it is a union of $t \ge 1$ many sets
$eb_1, \ldots, eb_t$ (called e-bubbles)$^{1}$ such that $\varepsilon_i := |eb_i| \ge 2$ for all $1 \le i \le t$. Thus $r$ can be
visualized (up to permutation of the entries) as

(1) $\quad r = (\underbrace{0, \ldots, 0}_{\alpha}, \underbrace{1, \ldots, 1}_{\beta}, \underbrace{2, \ldots, 2}_{\gamma}, \underbrace{e_1, \ldots, e_1}_{\varepsilon_1}, \ldots, \underbrace{e_t, \ldots, e_t}_{\varepsilon_t}).$

1 It has been observed that a 1 could be viewed as an e-bubble of length one. However, it is better to
stick to the given definition and demand a length of at least two. We further note that multivalued means
{0, 1, 2, e}-valued in the present article, but can have other meanings in other applications of the POE.

By definition, $r$ represents the family of sets $X \subseteq W$ satisfying

(2) $\quad X \cap \mathrm{zeros}(r) = \emptyset$ and $\mathrm{ones}(r) \subseteq X$ and $(\forall\, 1 \le i \le t)\ eb_i \cap X \neq \emptyset.$

It is however convenient to identify $r$ with the family of $X$'s satisfying (2). Then, obviously,

(3) $\quad |r| = 2^{\gamma} \cdot (2^{\varepsilon_1} - 1) \cdots (2^{\varepsilon_t} - 1).$

The Boolean lattice $2^{[w]}$ has $2^{2^w}$ many subsets $S$. The {0, 1, 2, e}-valued rows of length
$w$ yield some of these $S$, but far from all. However, as we shall see, one gets vastly more
sets $S$ than with {0, 1, 2}-valued rows; the latter merely deliver the $3^w$ many intervals $S$
of $2^{[w]}$. So let us proceed to calculate the number $f(w)$ of {0, 1, 2, e}-valued rows of length
$w$. Let $B \subseteq 2^{[w]}$ be any Boolean sublattice, say with bottom and top elements $\bot, \top \in B$
and with atoms $A_1, A_2, \ldots, A_s$. Since $A_i \cap A_j = \bot$ for $i \neq j$ and $A_1 \cup A_2 \cup \cdots \cup A_s = \top$,
it follows that the sets

$$A_1 \setminus \bot, \ \ldots, \ A_{\gamma} \setminus \bot, \ A_{\gamma+1} \setminus \bot, \ \ldots, \ A_{\gamma+t} \setminus \bot$$

partition $\top \setminus \bot$. Upon permutation we can assume that $s = \gamma + t$ and that for some $\gamma \ge 0$
the sets $A_i \setminus \bot$ are singletons for $i \le \gamma$, and of higher cardinalities $\varepsilon_1, \ldots, \varepsilon_t$ otherwise.
Hence $B$ matches a type (1) row $r$ with

$$\mathrm{zeros}(r) = W \setminus \top, \quad \mathrm{ones}(r) = \bot, \quad \mathrm{twos}(r) = (A_1 \setminus \bot) \cup \cdots \cup (A_{\gamma} \setminus \bot),$$

$$eb_1 = A_{\gamma+1} \setminus \bot, \quad eb_2 = A_{\gamma+2} \setminus \bot, \quad \text{up to} \quad eb_t = A_{\gamma+t} \setminus \bot.$$

Vice versa, every {0, 1, 2, e}-valued row $r$ yields$^{2}$ a Boolean sublattice $B \subseteq 2^{[w]}$. Thus $f(w)$
equals the number of Boolean sublattices of $2^{[w]}$. As detailed in [IS], this interpretation
of $f(w)$ yields

$$f(w) = \mathrm{Bell}(w + 2) - \mathrm{Bell}(w + 1)$$

where the $n$th Bell number $\mathrm{Bell}(n)$ gives the number of set partitions of an $n$-element set.
For instance $f(3) = \mathrm{Bell}(5) - \mathrm{Bell}(4) = 52 - 15 = 37$. Indeed, besides twenty-seven
{0, 1, 2}-valued rows there are three rows each of type $(*, e, e)$, $(e, *, e)$, $(e, e, *)$
(where $*$ is 0, 1 or 2), plus the row $(e, e, e)$.
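The identity can be cross-checked for small $w$. The Python sketch below (function names are our own, hypothetical choices) computes Bell numbers with the Bell triangle and counts the rows directly: choose the $m$ positions occupied by e-bubbles, partition them into blocks of size at least two, and fill the remaining positions with 0, 1 or 2.

```python
from math import comb

def bell_numbers(n_max):
    """Return [Bell(0), ..., Bell(n_max)] via the Bell triangle."""
    bells, row = [1], [1]
    for _ in range(n_max):
        new_row = [row[-1]]               # each row starts with the last entry of the previous one
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
        bells.append(row[0])
    return bells

def count_rows(w):
    """Count the {0,1,2,e}-valued rows of length w directly."""
    # p[m] = number of partitions of an m-set into blocks of size >= 2;
    # the block of a fixed element has j >= 1 further members
    p = [1] + [0] * w
    for m in range(2, w + 1):
        p[m] = sum(comb(m - 1, j) * p[m - 1 - j] for j in range(1, m))
    return sum(comb(w, m) * 3 ** (w - m) * p[m] for m in range(w + 1))

bells = bell_numbers(12)
print(count_rows(3), bells[5] - bells[4])
```

For $w = 3$ both expressions give the 37 rows listed above.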

It readily follows from (5.47) in [O] that $\mathrm{Bell}(w + 2) - \mathrm{Bell}(w + 1)$ is asymptotically equal
to $\mathrm{Bell}(w + 2)$ as $w \to \infty$. For all large enough $w$ it e.g. holds that

$$3^w \ll (w^{0.99})^{(w^{0.99})} < \mathrm{Bell}(w + 2) < w^w \ll 2^{2^w}.$$

2 Notice that $|B| < |r|$ but $B \not\subseteq r$ for $t > 0$. Of course $B = r$ for $t = 0$.

3.2 Counting all $k$-element transversals within a row

In order to calculate the number

$$\tau_k = \tau_k(\mathcal{H})$$

of all $k$-element transversals of a hypergraph $\mathcal{H}$ on $W$, define

$$\mathrm{Card}(r, k) := |\{X \in r : |X| = k\}|$$

for any {0, 1, 2, e}-valued row $r$. Obviously $\tau_k$ is the sum of all $\mathrm{Card}(r, k)$ where $r$ ranges over
all final rows produced by the transversal e-algorithm. For $r$ fixed, let us first determine
the range of $k$'s for which $\mathrm{Card}(r, k) \neq 0$. With notation as in (1) set

(4) $\quad c_{\min}(r) := \min\{|X| : X \in r\} = \beta + t.$

Put $X_{\max} := W \setminus \mathrm{zeros}(r)$. Then $X_{\max} \in r$ and $X \subseteq X_{\max}$ for all $X \in r$, whence

$$c_{\max}(r) := \max\{|X| : X \in r\} = |X_{\max}| = w - \alpha.$$

By (3) it is easy to compute $|r|$, but now we fix $k \in \{c_{\min}(r), \ldots, c_{\max}(r)\}$ and strive for
$\mathrm{Card}(r, k)$. The extreme cases $k_* := c_{\min}(r)$ and $k^* := c_{\max}(r)$ are trivial:

(5) $\quad \mathrm{Card}(r, k_*) = \varepsilon_1 \varepsilon_2 \cdots \varepsilon_t$ and $\mathrm{Card}(r, k^*) = 1.$

Computing $\mathrm{Card}(r, k)$ when $k_* < k < k^*$ is more subtle. It is an exercise (carried out
in [W4]) to apply inclusion-exclusion and obtain $\mathrm{Card}(r, k)$ as an alternating sum of $2^t$
binomial coefficients. Unless $r$ is long and $t$ is small, this method is inferior to the following
one, particularly when $\mathrm{Card}(r, k)$ is needed for successive values of $k$. We illustrate
it on

$$r_0 := (e_1, e_1, \ e_2, e_2, e_2, \ e_3, e_3, e_3, \ e_4, e_4, e_4, e_4)$$

for $w = 12$ and $1 \le k \le 5$:

Table 2: Calculating $\tau_k$ recursively

The line 2, 1, 0, 0, 0 gives the number of sets in $(e_1, e_1)$ having cardinality 1, 2, 3, 4, 5
respectively. The next line gives the number of sets in $(e_1, e_1, e_2, e_2, e_2)$ having these
cardinalities, and so forth. In general, if $c_1, c_2, \ldots, c_{k-1}$ are the numbers of sets in the
segment $(e_1, \ldots, e_1, \ldots, e_{s-1}, \ldots, e_{s-1})$ having cardinality $1, 2, \ldots, k - 1$ respectively, then the
number of sets in the extended segment $(e_1, \ldots, e_1, \ldots, e_{s-1}, \ldots, e_{s-1}, e_s, \ldots, e_s)$ having
cardinality $k$ equals

(6) $\quad \binom{\varepsilon_s}{1} c_{k-1} + \binom{\varepsilon_s}{2} c_{k-2} + \cdots + \binom{\varepsilon_s}{\varepsilon_s} c_{k - \varepsilon_s}.$

This also holds for $k \le \varepsilon_s$ provided we put $c_i := 0$ for $i \le 0$. For instance, if we take $s = 3$
and $k = 5$ in $r_0$, then (6) evaluates to

$$\binom{3}{1} c_4 + \binom{3}{2} c_3 + \binom{3}{3} c_2 = 3 \cdot 5 + 3 \cdot 9 + 1 \cdot 6 = 48.$$

As to the binomial coefficients of type $\binom{\varepsilon}{1}, \binom{\varepsilon}{2}, \ldots, \binom{\varepsilon}{\varepsilon}$, they are conveniently
calculated as follows:

$$\binom{\varepsilon}{1} = \varepsilon, \qquad \binom{\varepsilon}{j + 1} = \binom{\varepsilon}{j} \cdot \frac{\varepsilon - j}{j + 1} \quad \text{for } 1 \le j \le \varepsilon - 1.$$

By first multiplying with $\varepsilon - j$ and then dividing by $j + 1$ one stays in the realm of integers.
Doing this for $\varepsilon = \varepsilon_1$ up to $\varepsilon = \varepsilon_s$ requires $(\varepsilon_1 - 1) + \cdots + (\varepsilon_s - 1) < w$ multiplications and
just as many integer-valued divisions. Applying the Schönhage-Strassen algorithm for
multiplying two $w$-digit numbers (see Wikipedia), whose cost $O(w \log w \log\log w)$ we bound
by $O(w \log^2 w)$ for brevity, the at most $w$ many required binomial coefficients can be readied
in time $O(w^2 \log^2 w)$, and they occupy space $O(w^2)$.
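The multiply-then-divide recurrence is short enough to state as code. This is a minimal Python sketch (the name `binomial_row` is hypothetical); Python integers are unbounded, so no overflow handling is needed, and `math.comb` is used only to cross-check the result.

```python
from math import comb

def binomial_row(eps):
    """Return [C(eps,1), ..., C(eps,eps)] using integer arithmetic only."""
    row = [eps]
    for j in range(1, eps):
        # multiply first, divide second: C(eps,j) * (eps-j) is divisible by j+1
        row.append(row[-1] * (eps - j) // (j + 1))
    return row

print(binomial_row(4))
print(binomial_row(7) == [comb(7, j) for j in range(1, 8)])
```

Each coefficient costs one multiplication and one exact division, in line with the operation count given above.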



Theorem 1: Let $r$ be a {0, 1, 2, e}-valued row of length $w$ and let $K \le w$.
Then it costs space $O(w^2)$ and time $O(K w^2 \log^2 w)$ to compute the $K$ numbers $\mathrm{Card}(r, 1)$
up to $\mathrm{Card}(r, K)$.

Proof. We assume that $r$ consists only of $t$ many e-bubbles, so $\alpha = \beta = \gamma = 0$ in (1). Other
choices of $\alpha, \beta, \gamma$ only cause trivial adaptations. As seen, preparing the binomial coefficients
occurring in (6) costs $O(w^2 \log^2 w)$. For fixed $s \le t$ consider an initial segment of e-bubbles
$(e_1, \ldots, e_1, \ldots, e_s, \ldots, e_s)$ of lengths $\varepsilon_1, \ldots, \varepsilon_s$ respectively. If $\mathrm{Card}'(r, k)$ is the number of
$k$-element sets represented by this segment then, as seen in (6), calculating $\mathrm{Card}'(r, k)$
involves $\varepsilon_s$ many multiplications of pairs of previously determined numbers with at most $w$ digits
(and $\varepsilon_s - 1$ free additions), whence costs $O(\varepsilon_s w \log^2 w)$. Doing this for $1 \le k \le K$
gives $O(K \varepsilon_s w \log^2 w)$. Summing up yields $O(K \varepsilon_1 w \log^2 w) + \cdots + O(K \varepsilon_t w \log^2 w) =
O(K w^2 \log^2 w)$. □



It is easy to see that the described method to calculate $\mathrm{Card}(r, k)$ amounts$^{3}$ to expanding
a product of some obvious polynomials associated to the e-bubbles of $r$. For $r_0$ this gives

$$(2x + x^2)(3x + 3x^2 + x^3)^2 (4x + 6x^2 + 4x^3 + x^4)$$

$$= 72x^4 + 288x^5 + 534x^6 + 594x^7 + 431x^8 + 208x^9 + 65x^{10} + 12x^{11} + x^{12}.$$

Here $\mathrm{Card}(r_0, 4) = \mathrm{Card}(r_0, k_*) = \varepsilon_1 \varepsilon_2 \varepsilon_3 \varepsilon_4 = 72$ and $\mathrm{Card}(r_0, 12) = \mathrm{Card}(r_0, k^*) = 1$
match (5), and $\mathrm{Card}(r_0, 5) = 288$ matches Table 2.
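Recursion (6), i.e. the polynomial expansion, can be sketched in Python as follows. The function name `card_profile` and the shape-based interface (number of ones, number of twos, list of e-bubble lengths) are our own assumptions; for simplicity the sketch calls `math.comb` instead of the incremental binomial scheme, and it treats the twos as one extra block from which any number of elements, including none, may be taken.

```python
from math import comb

def card_profile(n_ones, n_twos, bubble_sizes):
    """Return a list c with c[k] = Card(r, k) for a row of the given shape."""
    c = [0] * n_ones + [1]                    # the ones belong to every member of r
    blocks = [(n_twos, 0)] + [(eps, 1) for eps in bubble_sizes]
    for size, least in blocks:                # least = minimum number of elements taken
        new = [0] * (len(c) + size)
        for k, ck in enumerate(c):
            for j in range(least, size + 1):  # j elements are taken from this block
                new[k + j] += comb(size, j) * ck
        c = new
    return c

profile = card_profile(0, 0, [2, 3, 3, 4])    # the row r_0 from above
print(profile[4:])
```

For $r_0$ the entries from index 4 onwards are the coefficients of the expanded polynomial.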



3 The author adopted this polynomial point of view and the matching Mathematica command
Expand[···] to get the numbers $\mathrm{Card}(r, k)$. Whatever the underlying method of Expand[···], for our
small values of $w$ that hardwired command likely beats a high-level Mathematica implementation of the
$O(w^2 \log^2 w)$ method from Theorem 1.

3.3 Generating all $k$-element transversals within a row

As to generating all $k$-element members of a {0, 1, 2, e}-valued row $r$, let us look at

$$r = (2, e_2, e_1, 2, 1, e_2, e_1, 0, e_2)$$

and $k = 6$. Similar to before we apply recursion according to the partition

$$\{5\} = \mathrm{ones}(r), \quad \{1, 4\} = \mathrm{twos}(r), \quad \{3, 7\} \ (\text{for } e_1), \quad \{2, 6, 9\} \ (\text{for } e_2).$$

Additionally we employ a last-in-first-out (LIFO) stack management. Namely, the stack
starts out with a single "root object" $x = (\{5\}, \{1, 4\}, [0, 2])$. This is a cryptic command
that in the next step $x$ needs to split into four sons whose first components are, respectively,
the subsets of $\{1, 4\}$ with cardinality between 0 and 2 joined to $\{5\}$. Each son's
second component is the next block of the partition (here $\{3, 7\}$). This gives rise to the
height-four stack in Fig. 1. Notice that $[1, 2]$ rather than $[0, 2]$ occurs three times because
$e_1 e_1$ (as opposed to 22) forbids the empty set. More subtly, in the bottom object
$(\{5\}, \{3, 7\}, [2, 2])$ the entry $[2, 2]$ demands that only $\{3, 7\}$ itself may eventually be added
to $\{5\}$ (because otherwise the final cardinality $k = 6$ cannot be reached).

The philosophy of LIFO being that always only the top record of the stack is treated,
the second stack gives rise to the third stack in Fig. 1. Its top object gives rise to the
final $k$-sets $\{5, 1, 4, 3, 7, 2\}$, $\{5, 1, 4, 3, 7, 6\}$, $\{5, 1, 4, 3, 7, 9\}$. After the next two new top
objects have each given rise to three final $k$-sets, the stack has $(\{5, 4\}, \{3, 7\}, [1, 2])$ as its
top object. Splitting it yields the fourth stack in Fig. 1. And so on and so forth.



First stack:
  ({5}, {1,4}, [0,2])

Second stack (top object first):
  ({5,1,4}, {3,7}, [1,2])
  ({5,4}, {3,7}, [1,2])
  ({5,1}, {3,7}, [1,2])
  ({5}, {3,7}, [2,2])

Third stack (top object first):
  ({5,1,4,3,7}, {2,6,9}, [1,1])
  ({5,1,4,7}, {2,6,9}, [2,2])
  ({5,1,4,3}, {2,6,9}, [2,2])
  ({5,4}, {3,7}, [1,2])
  ({5,1}, {3,7}, [1,2])
  ({5}, {3,7}, [2,2])

Fourth stack (top object first):
  ({5,4,3,7}, {2,6,9}, [2,2])
  ({5,4,7}, {2,6,9}, [3,3])
  ({5,4,3}, {2,6,9}, [3,3])
  ({5,1}, {3,7}, [1,2])
  ({5}, {3,7}, [2,2])

Fig. 1: Generating all $k$-element transversals with LIFO



Theorem 2: Let $r$ be a {0, 1, 2, e}-valued row of length $w$ and let $k \le w$
be fixed. Then the sets $X \in r$ with $|X| = k$ can be generated in time $O(w^2\, \mathrm{Card}(r, k))$.

Proof. We first make precise how the top object $(A, B, [i, j])$ in the sketched LIFO
algorithm is to be split. Here $A \subseteq W$ is the accumulated target set, and $B \subseteq W$ is the
e-bubble to some $e_m e_m \cdots e_m$ (see (1)), and by induction $[i, j]$ is the appropriate subinterval
of the integer interval $[1, |B|]$. The "sons" of $(A, B, [i, j])$ must be of type $(C, D, [*, *])$
where $D$ is the e-bubble$^{4}$ to $e_{m+1} \cdots e_{m+1}$, and $C$ can be any of the sets $A \cup B'$ where
$B'$ ranges over all subsets of $B$ with cardinality between $i$ and $j$. What is the interval
$[*, *]$ for a particular fixed $C$? Recalling that $k$ is the final cardinality to be achieved, and
putting $\delta := k - |C|$, a moment's thought shows that

$$[*, *] = [\max(1, \ \delta - \varepsilon_{m+2} - \cdots - \varepsilon_t), \ \min(\varepsilon_{m+1}, \ \delta - \sigma)]$$

where $\sigma$ is the cardinality of $\{m + 2, m + 3, \ldots, t\}$.

Running the LIFO algorithm amounts to building a rooted tree $T$ whose leaves correspond
to the $\mathrm{Card}(r, k)$ sets $X \in r$ with $|X| = k$. The unique path from a leaf $X$ to the root
hence traces $t + 2$ nodes. For instance:

$$X = \{5, 1, 4, 3, 7, 2\} \ \to \ (\{5, 1, 4, 3, 7\}, \{2, 6, 9\}, [1, 1]) \ \to \ (\{5, 1, 4\}, \{3, 7\}, [1, 2]) \ \to \ (\{5\}, \{1, 4\}, [0, 2]).$$

These nodes correspond to the objects that were split to create $X$. The claim follows
from $|T| \le (t + 2)\,\mathrm{Card}(r, k) \le w\,\mathrm{Card}(r, k)$ and the fact that each object in $T$ requires
work $O(w)$, as is clear from the above. ■

It is easy to see that $O(w^2)$ is the maximum size of the LIFO stack in Fig. 1; this height
can be much smaller than $\mathrm{Card}(r, k)$.
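The LIFO scheme can be sketched in Python as follows. This is an illustration under our own assumptions rather than the author's implementation: the name `k_subsets_of_row` and the argument layout are hypothetical, the twos are treated as a block with minimum 0, and sons are pushed in an arbitrary order, so the sequence of outputs may differ from Fig. 1 while the set of outputs is the same.

```python
from itertools import combinations

def k_subsets_of_row(ones, twos, bubbles, k):
    """Yield all k-element members of a row, using an explicit LIFO stack."""
    # each block is (elements, minimum number of elements that must be taken)
    blocks = ([(sorted(twos), 0)] if twos else []) + [(sorted(b), 1) for b in bubbles]

    def interval(m, size_a):
        """Admissible numbers of picks from block m when |A| = size_a."""
        delta = k - size_a
        rest_min = sum(least for _, least in blocks[m + 1:])
        rest_max = sum(len(elems) for elems, _ in blocks[m + 1:])
        elems, least = blocks[m]
        return max(least, delta - rest_max), min(len(elems), delta - rest_min)

    if not blocks:
        if len(ones) == k:
            yield frozenset(ones)
        return
    lo, hi = interval(0, len(ones))
    if lo > hi:
        return                                   # no member of cardinality k exists
    stack = [(frozenset(ones), 0, lo, hi)]       # the root object (A, block index, [lo, hi])
    while stack:
        a, m, lo, hi = stack.pop()               # always treat the top object
        elems, _ = blocks[m]
        for i in range(lo, hi + 1):
            for picked in combinations(elems, i):
                c = a | frozenset(picked)
                if m + 1 == len(blocks):
                    yield c                      # a final k-set
                else:
                    stack.append((c, m + 1, *interval(m + 1, len(c))))

result = list(k_subsets_of_row({5}, {1, 4}, [{3, 7}, {2, 6, 9}], 6))
print(len(result))
```

Because the interval of every son is chosen so that cardinality $k$ remains reachable, no object on the stack is ever a dead end, which is what the output-linear bound of Theorem 2 relies on.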



3.4 The special case $k = k_{\min}$

The important transversal number of a set system $\mathcal{H}$ is defined as

$$k_{\min}(\mathcal{H}) := \min\{|X| : X \in \mathrm{Tr}(\mathcal{H})\}.$$

For instance, finding the minimum number of pieces necessary in a set covering problem
amounts to determining $k_{\min} = k_{\min}(\mathcal{H})$ for some associated hypergraph $\mathcal{H}$. Note that
$k_{\min}$ as well as $\tau_{\min} := \tau_{k_{\min}}$ can be gleaned at once from a representation of $\mathrm{Tr}(\mathcal{H})$ by
{0, 1, 2, e}-valued rows. For instance, with respect to Table 1 we get from (4) that:

$$k_{\min} = \min\{c_{\min}(r_1), \ldots, c_{\min}(r_7)\} = \min\{0 + 4, \ 2 + 2, \ 2 + 2, \ 4 + 1, \ 3 + 1, \ 3 + 2, \ 4 + 1\} = 4.$$

Using (5) that gives

$$\tau_{\min} = \tau_4 = \mathrm{Card}(r_1, 4) + \mathrm{Card}(r_2, 4) + \mathrm{Card}(r_3, 4) + \mathrm{Card}(r_5, 4) = (2 \cdot 2 \cdot 4 \cdot 3) + (4 \cdot 3) + (2 \cdot 2) + 2 = 66.$$
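The two formulas translate directly into code. In the Python sketch below each final row is reduced to its number of ones and the lengths of its e-bubbles, as read off from the derivation in Section 2; the data layout and the function name are our own illustrative assumptions.

```python
from math import prod

# (number of ones, lengths of the e-bubbles) for r_1, ..., r_7
ROW_SHAPES = [
    (0, [2, 2, 4, 3]), (2, [4, 3]), (2, [2, 2]), (4, [2]),
    (3, [2]), (3, [2, 2]), (4, [2]),
]

def k_min_and_tau_min(shapes):
    """Return (k_min, tau_min) from the shapes of the final rows."""
    c_min = [beta + len(eps) for beta, eps in shapes]       # formula (4)
    k_min = min(c_min)
    # formula (5): a row attaining k_min contributes the product of its bubble lengths
    tau_min = sum(prod(eps) for (_, eps), c in zip(shapes, c_min) if c == k_min)
    return k_min, tau_min

print(k_min_and_tau_min(ROW_SHAPES))
```

Only the rows with $c_{\min}(r) = k_{\min}$ contribute, here $r_1, r_2, r_3, r_5$.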

4 For convenience we assume that $m + 1, m + 2$ are still $\le t$. Otherwise special cases arise that are
similarly handled.

It is evident that also generating all transversals $X$ with $|X| = k_{\min}$ can be done more
smoothly than in Section 3.3. The minimum-cardinality transversals constitute a subfamily
of the popular [EMG] inclusion-minimal transversals. The e-algorithm seems to
be predestined to handle that subfamily, although it isn't easy to formally assess its
performance (work in progress).



4 The transversal e-algorithm in theory 

If a seventh constraint corresponding to, say, $H_7 = \{3, 4, 5\}$ were to be imposed in Table
1, this would cause the cancellation of $r_3$ to $r_7$, and so the work to produce these (multivalued)
rows would have been in vain. Fortunately such costly deletions of rows can
be prevented by looking ahead. Specifically, any POE-produced row is called feasible if it
contains at least one model $X_0$. Because $r$ is the disjoint union of its "candidate sons"
$r[e], r[0, e], r[0, 0, e]$ and so forth (Section 2), at least one of them will remain feasible. As
opposed to other applications of the POE, here feasibility is easily tested. Namely, $r$ is
feasible if and only if

(7) $\quad (\forall\, 1 \le i \le h) \quad H_i \not\subseteq \mathrm{zeros}(r).$

Obviously (7) is necessary, and it is sufficient because then $X_{\max} = W \setminus \mathrm{zeros}(r)$ is a
model. The non-feasible sons can hence be deleted right away. More generally, fix $k \in [w]$
and call $r$ extra feasible if it contains a model of cardinality $\ge k$. The above remarks
constitute the essence of the proof of Theorem 3.
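Test (7) and its extra feasible variant amount to a few subset checks. The Python sketch below uses hypothetical function names; the zero sets of $r_1, \ldots, r_7$ are read off from the rows derived in Section 2, and $H_7 = \{3, 4, 5\}$ is the additional hyperedge mentioned above.

```python
EDGES = [
    {3, 4, 9}, {5, 10}, {6, 7, 11, 12}, {8, 13, 14},
    {1, 2, 3, 4, 5, 6, 7, 8}, {3, 4, 5, 8, 12, 13},
]
H7 = {3, 4, 5}

# zeros(r_1), ..., zeros(r_7)
ZEROS = [
    set(), {3, 4}, {3, 4, 5}, {3, 4, 5, 8, 13},
    {3, 4, 5, 6, 7}, {3, 4, 5, 6, 7, 8}, {3, 4, 5, 6, 7, 8, 12},
]

def is_feasible(zeros, edges):
    """Test (7): no hyperedge is contained in zeros(r)."""
    return all(not edge <= zeros for edge in edges)

def is_extra_feasible(zeros, edges, w, k):
    """Feasible, and the largest member X_max has at least k elements."""
    return is_feasible(zeros, edges) and w - len(zeros) >= k

print([is_feasible(z, EDGES + [H7]) for z in ZEROS])
```

With $H_7$ added, only $r_1$ and $r_2$ pass the test, in agreement with the cancellation of $r_3$ to $r_7$ described above.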



Theorem 3: Let $\mathcal{H}$ be a $(w, h)$-hypergraph, and let $k \in [w]$. Then the
transversal e-algorithm can be adapted to calculate:

a) The number $N$ of all transversals of $\mathcal{H}$ in time $O(N h^2 w^2)$;

b) The number $N$ of all at least $k$-element transversals of $\mathcal{H}$ in time $O(N k h^2 w^2 \log^2 w)$.



Proof. As before we think of $r_0 = (2, 2, \ldots, 2)$, with components labelled by the elements
of $W = [w]$, as the powerset of $W$. Initially the "working stack" solely comprises the row
$r_0$ with the pointer $PC(r_0) = 1$ (where PC stands for pending constraint). Note that $r_0$
is extra feasible since $W \in r_0$. Generally, the top row $r$ of the working stack is treated
as follows. If $PC(r) = j$ (for some $j \in [h]$) then the hyperedge $H_j \in \mathcal{H}$ is "imposed"
upon $r$, which means that the set $U$ of all $X \in r$ with $X \cap H_j \neq \emptyset$ is represented as a
disjoint union of $s \le w$ many rows $r_1, \ldots, r_s$. According to [W1, Section 5], this is always
possible. (Section 2 of the present article illustrates the most subtle case.) Writing $U$ as
$r_1 \cup r_2 \cup \cdots \cup r_s$ costs $O(sw) = O(w^2)$. Because $r$ was extra feasible by induction, at least
one of its candidate sons $r_j$ will be as well. Since the extra feasibility of $r_j$ amounts to
the truth of both (7) and $|X_{\max}| \ge k$, it costs $O(shw) = O(hw^2)$ to sieve the sons of $r$,
i.e. the extra feasible rows among $r_1, \ldots, r_s$. Altogether the cost of one imposition of a
constraint upon a row is $O(w^2) + O(hw^2) = O(hw^2)$.

The $R$ final rows can be viewed as the leaves of a tree with root $(2, 2, \ldots, 2)$ that has height
$h$; each imposition triggers all sons of some node. Therefore the number of impositions is
at most $Rh$ (distinct final rows possibly having some of their $h$ forefathers coinciding). It
follows that producing the $R$ final rows costs $O(Rh \cdot hw^2) = O(N h^2 w^2)$ in view of $R \le N$,
which holds by the disjointness of final rows. Counting all transversals within a row costs $O(w)$ by
(3), whence doing it for all rows costs $O(Nw) = O(N h^2 w^2)$. This yields claim (a).

As to (b), by Theorem 1 it costs $O(k w^2 \log^2 w)$ to count the

$$|r| - \mathrm{Card}(r, 1) - \mathrm{Card}(r, 2) - \cdots - \mathrm{Card}(r, k - 1)$$

many transversals $X \in r$ with $|X| \ge k$. Doing it for all final rows costs $O(N k w^2 \log^2 w)$.
Claim (b) thus follows from

$$O(N h^2 w^2) + O(N k w^2 \log^2 w) = O(N k h^2 w^2 \log^2 w). \ \Box$$

As is clear from the proof, the $O(N k h^2 w^2 \log^2 w)$ bound can be improved to $O(R k h^2 w^2 \log^2 w)$
where $R \le N$ is the mentioned number of final {0, 1, 2, e}-valued rows. Although in practice
$R$ is often much smaller than $N$, the only obvious theoretical upper bound for $R$ is $N$. If
rather than counting we must$^{5}$ generate all relevant transversals one by one, then we have
no choice between $R$ and $N$ but are stuck with the latter.
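For concreteness, the working-stack procedure from the proof can be sketched in Python. This is a simplified illustration under our own assumptions, not the author's Mathematica code: rows are lists with entries 0, 1, 2 or `('e', label)`, the names `impose`, `row_size` and `count_transversals` are hypothetical, and the look-ahead via (7) is replaced by simply dropping branches that become empty.

```python
def impose(row, edge):
    """Return disjoint rows whose union is {X in row : X hits edge} (0-based positions)."""
    if any(row[p] == 1 for p in edge):
        return [row]                                   # every member hits the edge already
    hit = {}                                           # e-bubble label -> its positions inside the edge
    for p in edge:
        if isinstance(row[p], tuple):
            hit.setdefault(row[p], []).append(p)
    twos = [p for p in edge if row[p] == 2]
    fresh = max([s[1] for s in row if isinstance(s, tuple)], default=0) + 1
    parts = list(hit.items()) + ([(None, twos)] if twos else [])
    sons, cur = [], list(row)
    for label, inside in parts:
        outside = [p for p, s in enumerate(cur)
                   if label is not None and s == label and p not in inside]
        son = list(cur)                                # branch "e": some element of inside is taken
        for p in inside:
            son[p] = 1 if len(inside) == 1 else ('e', fresh)
        for p in outside:
            son[p] = 2
        sons.append(son)
        fresh += 1
        if label is not None and not outside:
            break                                      # branch "0" would empty this e-bubble
        for p in inside:                               # branch "0": nothing of inside is taken
            cur[p] = 0
        for p in outside:
            cur[p] = 1 if len(outside) == 1 else label
    return sons

def row_size(row):
    """Formula (3): a factor 2 per entry 2, a factor 2^eps - 1 per e-bubble."""
    size, bubbles = 2 ** sum(1 for s in row if s == 2), {}
    for s in row:
        if isinstance(s, tuple):
            bubbles[s] = bubbles.get(s, 0) + 1
    for eps in bubbles.values():
        size *= 2 ** eps - 1
    return size

def count_transversals(w, edges):
    """Count all transversals; edges are sets of elements of {1..w}."""
    stack, total = [([2] * w, 0)], 0                   # (row, index of the pending constraint)
    while stack:
        row, j = stack.pop()
        if j == len(edges):
            total += row_size(row)                     # a final row
        else:
            edge = sorted(e - 1 for e in edges[j])
            stack.extend((son, j + 1) for son in impose(row, edge))
    return total

EDGES = [{3, 4, 9}, {5, 10}, {6, 7, 11, 12}, {8, 13, 14},
         {1, 2, 3, 4, 5, 6, 7, 8}, {3, 4, 5, 8, 12, 13}]
print(count_transversals(14, EDGES))
```

On the hypergraph of Section 2 the sketch returns the count 8784 obtained there, without ever listing individual transversals.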

Let $s_{\max}$ be the maximum number of sons of a multivalued row that occurs in any fixed
run of the POE (whether e-algorithm or something else). According to [W1, Thm.6], using
a LIFO stack management (akin to Section 3.3) reduces the space requirement of POE-counting
to $O(h w s_{\max})$. It is easy to see that for the e-algorithm one has $s_{\max} \le \min\{d, h\}$
where $d := \max\{|H_i| : 1 \le i \le h\}$, and so $O(h w s_{\max}) = O(h w^2)$ is independent of $N$.

Notice that $X$ is a transversal of $H_1, \ldots, H_h$ if and only if its complement $X^c = W \setminus X$
is a noncover in the sense that $X^c \not\supseteq H_i$ for all $1 \le i \le h$. Although the e-algorithm
can thus count (or generate) noncovers, it pays to introduce the symbolism $nn \cdots n$ :=
"at least one 0" and a corresponding noncover n-algorithm which produces the noncovers
"directly", not as $X^c$. The noncover n-algorithm in turn generalizes to the Horn n-algorithm
of [W1] which counts the models of any given Horn formula. Because Theorem
3a and Theorem 3b above correspond to not so obvious special (and dualized) cases of
[W1, Thm.2] respectively [W1, Thm.7], we deemed it worthwhile to offer a fresh proof.
Even more so because (7) is much smoother than the corresponding feasibility test for
general Horn formulae. Theorem 4 below transfers further results of [W1] about fixed
cardinality models to our framework. Its proof is omitted (being along the lines of the
proof above) but we mention that Theorem 1 and Theorem 2 are used throughout. They
appeared already as statements (16) and (15) in [W1], but their proofs were postponed
to the present article.

5 In practice, generating all of them is mainly necessary for exact optimization, but then one rather
generates them bunch-wise in multivalued rows.



Theorem 4: Let $\mathcal{H}$ be a $(w, h)$-hypergraph and let $k \in [w]$. To avoid trivial
special cases we assume that the number $N$ of various models considered below is $> 0$.
Define $R \le N$ as the number of final rows delivered by the transversal e-algorithm
when applied to $\mathcal{H}$.

(a) [W1, Thm.10] The number $N$ of transversals $X$ of $\mathcal{H}$ with $|X| = k$ can be calculated in
time $O(R\, 2^h h w^4 k)$.

(a') [W1, remark to Thm.10] The $N$ transversals $X$ of $\mathcal{H}$ with $|X| = k$ can be generated in
time $O(N\, 2^h h w^5)$.

(b) [W1, Thm.8] Suppose that $h \le k \le w$. Then the number $N$ of transversals $X$ of $\mathcal{H}$
with $|X| = k$ can be calculated in time $O(R k h^2 w^3)$.

(b') [W1, Thm.4] Suppose that $h \le k \le w$. Then the $N$ transversals $X$ of $\mathcal{H}$ with $|X| = k$
can be generated in time $O(N h^2 w^2)$.

(c) [W1, Thm.9] Suppose the number of $k'$-element transversals increases as $k'$ ranges
from $w$ down to $k$. Then the number $N$ of $\mathcal{H}$-transversals $X$ with $|X| = k$ can
be calculated in time $O(N h^2 w^5)$.



5 Conclusion 

In [W4], which is a somewhat verbose preliminary version of the present article, a Mathematica
implementation of the e-algorithm is pitted against Mathematica implementations
of (a) inclusion-exclusion, (b) lexicographic generation, and (c) the "hardwired" whence
advantaged Mathematica command SatisfiabilityCount. The latter is based on binary
decision diagrams (BDD's).

Broadly speaking, the e-algorithm combines the advantages of inclusion-exclusion and
SatisfiabilityCount without adopting their disadvantages. Let $\tau$ be the number of all
transversals. The advantage of inclusion-exclusion is that calculating all $\tau_k$ ($1 \le k \le w$)
doesn't take much longer than calculating $\tau$ (for fixed $h$ the time scales roughly proportionally
to $w$); its disadvantage is the ominous factor $2^h$. The advantage of SatisfiabilityCount
is its benign exponential dependence on $h$. Its disadvantage is the inability of BDD's to
handle fixed-cardinality constraints.

6 The $O(K w^2 \log^2 w)$ bound in Theorem 1 actually improves upon the $O(K w^3)$ bound in [W1, (16)].
This entails that (a') in Theorem 4 above could be slightly improved accordingly; we omitted it in order
to minimize confusion.



Although some of the experimental results in [W4] remain interesting, the author also accepts
the following criticism of one referee:

SatisfiabilityCount is a function to count the solutions of a satisfiability
problem, and transversals are only a special case, so the function is "abused"
(in particular when lots of artificial constraints are added to find transversals
of a certain size!) to perform a task it was not programmed for.



But then again, the principle of exclusion (Section 1) continues to tease SatisfiabilityCount
when the issue is counting (let alone generating) the models of an arbitrary Boolean function
in CNF, provided it happens to have few or no models. This is work in progress, and
so are other applications of POE. If Mathematica code algorithms compare favorably with
corresponding hardwired Mathematica commands, obviously the former algorithms are
inherently superior. It has been suggested (fairly or not) that Mathematica commands
aren't state of the art, and hence the author's POE-algorithms should be implemented
in C++ (say) and compared to existing C++ implementations. Being not familiar with C++
(and too lazy to learn), I leave that worthwhile task to others.

See also Section 9 in [W1] for further analysis of the pros and cons of POE.